The aim of this paper is to propose a feature-fusion-based Audio-Visual Speaker Identification (AVSI) system that operates under varied illumination conditions. Among the different fusion strategies, feature-level fusion is used for the proposed AVSI system, with a Hidden Markov Model (HMM) used for learning and classification. Because the feature set contains richer information about the raw biometric data than the other fusion levels, integration at the feature level is expected to provide better authentication results. In this paper, Mel Frequency Cepstral Coefficients (MFCCs) and Linear Prediction Cepstral Coefficients (LPCCs) are combined to form the audio feature vectors, and Active Shape Model (ASM) based appearance and shape facial features are concatenated to form the visual feature vectors. These combined audio and visual features are then used for feature fusion. Principal Component Analysis (PCA) is used to reduce the dimension of the audio and visual feature vectors. The VALID audio-visual database is used to measure the performance of the proposed system under four different illumination levels. Experimental results demonstrate the significance of the proposed audio-visual speaker identification system for various combinations of audio and visual features.
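The feature-level fusion outlined above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the audio and visual features are already extracted and frame-synchronised, that PCA is applied to each modality separately before concatenation, and that the feature dimensions and component counts shown are arbitrary placeholders. The function names `pca_reduce` and `fuse_features` are hypothetical, random arrays stand in for real MFCC, LPCC, and ASM features, and the HMM classification stage is omitted.

```python
import numpy as np


def pca_reduce(features, n_components):
    """Project row-vector features onto their top principal components."""
    centered = features - features.mean(axis=0)
    # SVD of the centered data; the rows of vt are the principal directions,
    # ordered by decreasing explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def fuse_features(mfcc, lpcc, shape, appearance, n_audio=10, n_visual=10):
    """Feature-level fusion: build, reduce, and concatenate both modalities."""
    audio = np.hstack([mfcc, lpcc])          # combined audio vector per frame
    visual = np.hstack([shape, appearance])  # combined visual vector per frame
    audio_red = pca_reduce(audio, n_audio)
    visual_red = pca_reduce(visual, n_visual)
    # One fused observation vector per frame (assumed input to the classifier).
    return np.hstack([audio_red, visual_red])


# Placeholder data: 50 synchronised frames with assumed feature dimensions.
rng = np.random.default_rng(0)
n_frames = 50
mfcc = rng.normal(size=(n_frames, 13))
lpcc = rng.normal(size=(n_frames, 12))
shape = rng.normal(size=(n_frames, 40))
appearance = rng.normal(size=(n_frames, 30))

fused = fuse_features(mfcc, lpcc, shape, appearance)
print(fused.shape)
```

Each row of `fused` is a single audio-visual observation vector; in a system of the kind described, sequences of such vectors would be the observations on which per-speaker HMMs are trained and scored.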